Abstract - Aim: This pilot study was designed to determine the
maximum level of compression of digitised images of the
eardrum and to develop assessment protocols.

Methods:  The  JPEG  algorithm  was  used  to  compress  fifteen 
1.44MB  images  to  different  sizes.  As  an  objective  assessment,  the 
RMS  errors  between  the  original  and  compressed  images  were 
calculated.  Two  assessors  graded  image  quality  and  recorded 
their  observations  of  clinically  significant  abnormalities,  which 
were  compared  to  a  gold-standard. 

Results: RMS error increased markedly when images were
compressed to below about 20KB, and at least 90% of the images
were graded as being of good quality until image size
went below 15KB. Agreement with the gold-standard in the
identification of abnormalities was >70% for the two assessors,
but there was a wide range of sensitivity and specificity values.
In both cases there was no relationship with the level of image
compression.

Conclusion: Images of the eardrum can be compressed using
JPEG to about 20KB before image quality is affected, but
further studies require higher-quality original images and a
well-understood assessment protocol before conclusions can be
drawn on the effect of image compression on the ability to
detect clinically significant abnormalities.

Keywords  -  Telemedicine,  otolaryngology,  image  compression, 
eardrum 

I. Introduction

The  otoscope  is  one  of  the  basic  tools  of  doctors  and  allied 
health  workers  like  audiologists  and  nurses.  It  projects  light 
into  the  ear  canal  and  presents  a  view  of  the  eardrum  and  the 
ear  canal,  allowing  assessment  of  the  outer  and  middle  ear. 
Otoscopes  can  also  be  supplied  with  a  video  camera  and 
associated  image-digitising  equipment.  Doctors  can  make  use 
of  these  images  for  diagnosis  and  documenting  the 
progression  of  ear  disease. 

Just as in other areas of medicine such as dermatology,
ophthalmology and radiology, the accessibility and
affordability of imaging and computer equipment are now
making telemedicine an attractive method for the delivery of
otology health care to people in rural and remote areas. In
many  countries,  as  is  the  case  in  Australia,  these  areas  are 
often  severely  under-serviced  for  specialist  medical  care. 
Large  population  centres  are  many  hundreds  of  kilometres 
away  from  resident  otology  and  audiological  services,  and  the 
task  falls  on  local  general  practitioners  and  health  workers  to 
provide  the  primary  care  for  these  and  other  medical 
specialties. 

Diagnosis and treatment of most ear disorders require a
clinical history and otoscopic examination, with audiology if
hearing loss is present. Common conditions such as otitis
media and glue ear can be diagnosed utilising images of the
ear,  and  telemedicine  is  a  prime  candidate  for  playing  a  role 
in  improving  the  delivery  of  some  aspects  of  ear  health. 
Store-and-forward techniques are suitable for this: images
are captured by suitable devices, stored, and then
transmitted via communication networks to an ear
specialist for assessment. However, as uncompressed images
are  usually  well  over  1MB  in  size,  they  take  a  long  time  to 
transmit  through  the  telecommunication  networks  found  in 
rural  and  remote  areas.  These  are  often  slow  and  unreliable. 
Therefore,  to  make  image  transmission  practical,  image 
compression  is  essential,  especially  as  image  numbers 
increase. 

The most popular algorithm used in image compression is
the one developed by the Joint Photographic Experts Group
(JPEG), which has been deployed in almost all imaging
programmes and is also used in medical imaging. The algorithm
breaks the image into 8 pixel by 8 pixel blocks, converts the
information in each block into the frequency domain, and
removes some of the higher frequency information depending
on the desired level of compression. The remaining
information is coded and then compacted to remove
redundancy. It is especially suitable for natural image scenes,
which include medical images.
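
As a rough illustration of the block transform described above, the
following Python sketch applies an orthonormal 2-D discrete cosine
transform to a single 8 by 8 block and discards the higher frequency
coefficients. It is a simplification and not the software used in this
study: baseline JPEG quantises the coefficients with a
quality-dependent table rather than zeroing them outright, and also
performs colour conversion and entropy coding, none of which is shown.
The function names and the `keep` cut-off are hypothetical.

```python
import math

N = 8  # JPEG block edge length in pixels


def _scale(k):
    # Normalisation that makes the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# BASIS[k][n]: scaled cosine basis value for frequency k at position n.
BASIS = [[_scale(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct_2d(block):
    """Forward 2-D DCT of an 8x8 block (list of 8 rows of 8 numbers)."""
    return [[sum(block[x][y] * BASIS[u][x] * BASIS[v][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT, returning the reconstructed 8x8 block."""
    return [[sum(coeffs[u][v] * BASIS[u][x] * BASIS[v][y]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def drop_high_frequencies(coeffs, keep):
    """Zero every coefficient whose frequency index sum u + v >= keep.

    Assumption: a crude stand-in for JPEG quantisation.
    """
    return [[coeffs[u][v] if u + v < keep else 0.0
             for v in range(N)] for u in range(N)]


# Synthetic smooth gradient block, used only for demonstration.
block = [[8 * x + y for y in range(N)] for x in range(N)]
approx = idct_2d(drop_high_frequencies(dct_2d(block), keep=4))
```

A smaller `keep` discards more detail; with `keep=1` only the block
average survives, which corresponds to the blocking artefact seen at
very high compression.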

Yogesan  and  colleagues  [1]  reported  transmission  times  of 
30  minutes  for  uncompressed  ophthalmic  images,  which 
could  be  reduced  to  some  minutes  with  JPEG  compression. 
Other studies into the compression of medical images have
also been reported [2-5]. While the case for telemedicine in
otolaryngology has been discussed [6-8], there are no
published studies on the requirements for digitised images. In
this  paper  we  describe  a  pilot  study  into  the  compression  of 
otology  images,  discuss  the  methods  used  for  assessment,  and 
propose  some  future  studies  in  this  area. 

II.  Methodology 

A.  Images 

The Welch-Allyn VDX Video-Otoscope was used to
collect the views of 15 eardrums of remote area patients seen
by staff of the TVW Telethon Institute for Child Health
Research, Perth. The images represented a cross section of
various ear conditions. Each was stored as a 24-bit
colour uncompressed bitmap file, and saved on a CDROM for
transport. The file size of each image was 1.44MB.

B.  Image  compression 

Custom written software was used to compress each of the
images with the JPEG algorithm. To cover a range of
compressions the following Q (Quality) factors were used:
100, 70, 60, 50, 40, 30, 20, 15 and 10. This produced a range
of image sizes, from approximately 220KB (Q=100) to
approximately 11KB (Q=10). At Q=10, the blocking artefact
is seen clearly; however, it is important to cover a range of
compression levels, from where there will be no noticeable
effect (Q=100) through to where compression has made the
image unrecognisable from the original.
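
To put these sizes in perspective, the short Python sketch below
computes the compression ratio implied by the reported file sizes. It
assumes 1MB = 1,000,000 bytes and 1KB = 1,000 bytes, which the text
does not specify, and the function name is hypothetical.

```python
ORIGINAL_BYTES = 1.44e6  # 1.44MB original; assumes 1MB = 10**6 bytes


def compression_ratio(compressed_kb, original_bytes=ORIGINAL_BYTES):
    """Ratio of original to compressed size (assumes 1KB = 1000 bytes)."""
    return original_bytes / (compressed_kb * 1000)


# Approximate sizes reported for the extreme Q factors, plus the 20KB
# threshold discussed in the results.
for label, size_kb in [("Q=100", 220), ("20KB threshold", 20), ("Q=10", 11)]:
    print(f"{label}: about {compression_ratio(size_kb):.0f}:1")
```

Under these assumptions the Q=100 images are roughly 7 times smaller
than the originals, a 20KB image is 72 times smaller, and the Q=10
images are roughly 131 times smaller.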

C.  Image  assessment  -  objective 

All of the compressed and original images were split into
their three colour channels: Red (R), Green (G) and Blue (B).
A custom written program then calculated the
Root-Mean-Square (RMS) Error between the original images and
the various compressed images for each of the colour channels;
these data were plotted against image size.
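
A minimal sketch of a per-channel RMS error calculation is given
below. The paper does not state the exact formula used by the custom
program, so this assumes the usual definition: the square root of the
mean squared pixel difference over all pixels of a channel. Images are
represented as nested lists of (R, G, B) tuples for simplicity, and
the function name is hypothetical.

```python
import math


def channel_rms_error(original, compressed):
    """Return the (R, G, B) RMS errors between two same-sized RGB images.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    """
    squared_sums = [0.0, 0.0, 0.0]
    pixel_count = 0
    for row_o, row_c in zip(original, compressed):
        for px_o, px_c in zip(row_o, row_c):
            for ch in range(3):
                diff = px_o[ch] - px_c[ch]
                squared_sums[ch] += diff * diff
            pixel_count += 1
    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")
    return tuple(math.sqrt(s / pixel_count) for s in squared_sums)


# Tiny synthetic 2x2 example (not study data).
original = [[(100, 50, 20), (100, 50, 20)],
            [(100, 50, 20), (100, 50, 20)]]
compressed = [[(102, 50, 23), (102, 50, 17)],
              [(102, 50, 20), (102, 50, 20)]]
rms_r, rms_g, rms_b = channel_rms_error(original, compressed)
```

In practice the pixel data would be read from the bitmap and the
decoded JPEG with an imaging library; that step is omitted here.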

D.  Image  assessment  -  subjective 

One otolaryngology specialist and a trainee assessed all the
original images and a subset of the compressed images. The
subset comprised those produced with Q-settings 15, 20, 30, 40,
50 and 60. The Q=100 images were still too large to have a
beneficial  impact  in  reducing  image  transmission  times,  and 
empirically  there  was  no  perceptible  difference  between  these 
and  the  original  images.  The  Q=10  images  were  of  extremely 
low  quality,  and  were  only  slightly  smaller  in  size  than  the 
Q=15  images. 

Before this assessment took place, another otolaryngologist
assessed all the original images to determine the
gold-standard to which the other assessments were compared.
These observations included earwax, discharge, scarring of the
eardrum, and perforation of the eardrum (see Table 1).

Table 1: Gold-standard assessments. Image quality is indicated by
E: excellent, G: good, P: poor or U: unsuitable. Other column headings:
W: wax in ear canal, D: discharge, S: sand in ear canal, P: perforation of the
eardrum, T: tympanosclerosis, A: atrophic segment/retraction, SOM: serous
otitis media, N: normal eardrum. A ‘Y’ indicates the presence of these.

The images were presented in a random order on a computer
monitor to the two assessors, who were asked to grade the
image quality as E (Excellent), G (Good), P (Poor) or U
(Unsatisfactory) for making a clinical assessment. In normal
clinical practice, more information than an image of the
eardrum is used to make an assessment and diagnosis. In this
study, the assessor was asked only to judge the suitability of
the image for diagnosis.

They  were  also  asked  to  record  their  observations  of  the  ear 
canal  and  the  eardrum.  They  were  presented  with  a  list  of 
pathology  and  abnormalities  they  may  or  may  not  see. 

Finally, the assessors also noted the ear they were viewing
(right  ear  or  left  ear),  and  were  also  given  the  opportunity  to 
record  other  observations. 

The  assessors  were  asked  not  to  go  back  to  previous 
recordings  and  observations  in  cases  where  they  may  have 
recognised  the  image,  nor  go  back  to  view  the  previous 
images. This was designed to minimise the effect of memory.

After  the  assessments  were  tabulated,  sensitivity  and 
specificity  were  calculated  to  compare  the  assessments  with 
the  gold-standard.  Sensitivity  is  a  measure  of  the  ability  to 
detect  correctly  the  presence  of  the  condition  or  abnormality, 
while  specificity  is  a  measure  of  the  ability  to  assess  correctly 
that  the  condition  or  abnormality  does  not  exist. 
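
The calculation can be sketched in Python as follows, using the
standard definitions: sensitivity is true positives divided by all
gold-standard positives, specificity is true negatives divided by all
gold-standard negatives, and agreement is the fraction of images on
which the assessor matches the gold-standard. The function name and
the example data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(gold, observed):
    """Compare an assessor's findings with the gold-standard.

    gold, observed: equal-length lists of booleans, one per image,
    True where the condition was recorded as present.
    Returns (sensitivity, specificity, agreement); a rate is None when
    its denominator is zero.
    """
    if len(gold) != len(observed):
        raise ValueError("gold and observed must have the same length")
    pairs = list(zip(gold, observed))
    tp = sum(1 for g, o in pairs if g and o)          # true positives
    fn = sum(1 for g, o in pairs if g and not o)      # false negatives
    tn = sum(1 for g, o in pairs if not g and not o)  # true negatives
    fp = sum(1 for g, o in pairs if not g and o)      # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    agreement = (tp + tn) / len(pairs) if pairs else None
    return sensitivity, specificity, agreement


# Illustrative data only: one condition assessed on five images.
gold = [True, True, True, False, False]
observed = [True, False, False, False, True]
sens, spec, agree = sensitivity_specificity(gold, observed)
```

Returning None for an undefined rate avoids a division by zero when a
condition is absent from, or present in, every image of the set.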

III.  Results 

A.  Image  assessment  -  objective 

Figure 1 is a plot of RMS Error versus image size after
compression of the red channel of all 15 images, showing a
sudden increase in RMS Error as the image size falls below
about 20KB. The data from the green and blue channels have
a similar distribution and trend, and are not plotted. The
difference in the effects of compression on the three colour
channels is shown in Figure 2, which plots the average image
size of each channel for each Q-value against the average
RMS Error. It shows that the RMS Error is greatest in the red
channel and least in the green channel.


Figure 1: RMS Error versus image size (bytes) for the red channel. The
RMS Error for each compression level; 15 images, with a power series
curve fit.


B.  Image  assessment  -  subjective 

The assessment of image quality after image compression is
summarised in Figure 3; the left column shows the
gold-standard assessment. It shows that the quality was in most
cases graded as ‘Good’ or better until the two highest
compressions. Figure 4 plots the percentage of ears correctly
identified (right or left) by the assessors. Figure 5 plots the
agreement between both assessors and the gold-standard
assessment of each of the conditions shown in Table 1. The
sensitivity and specificity for assessor 1 only are plotted in
Figures 6 and 7; sand in the ear canal is not shown as no
assessor saw this in any of the images.

Figure 2: RMS Error for the colour channels versus compression (Q). The
average RMS Error +/- standard deviation for each Q-value for the three
colour channels, where the image size at each Q-value is averaged.

Figure 3: Image quality versus image compression. Image quality as
assessed by the two assessors (averaged) for the different compression
rates; the gold-standard assessment is shown in the left bar.

Figure 4: Percentage of correct identification of the ear (right or left) by
the two assessors for the different compression rates.


Figure 5: Agreement with the gold standard versus compression quality (Q).
Agreement (%) between the assessors and the gold standard on the
presence or absence of a clinical abnormality in the image.

Figures 6 and 7: Sensitivity and specificity for assessor 1 for the different
compression rates. W: wax, D: discharge, P: perforation,
T: tympanosclerosis, R: retraction, S: serous otitis media, N: normal.


IV.  Discussion 

There is no definitive answer as to what RMS Error value
is indicative of poor image quality, and therefore the
results can be used only to compare the relative effect of
compression rate, and the effect of compression on the colour
channels. Figure 1 demonstrates that although the effect of
compression on each image is different (both on image size
and RMS error), there is a similar trend for all images, with a
sudden rise in RMS error occurring at the higher compression
rates. It would appear that there would be little benefit from
compressing an image beyond about 20KB, where there is a
sharp increase in the cost to image quality.

As shown in Figure 2, the effect of compression is largest
in the red channel, and least in the green channel. The error
bars (+/- standard deviation) indicate a statistical difference
between the results for the red and green channels, but not
between the blue channel and either of the other two
channels. The reason for the differing effects on the colour
channels is yet to be explored, but can probably be
attributed to the reflective properties of the various structures
recorded on the image.

As  can  be  expected  with  many  subjective  assessments,  the 
results  in  this  study  were  affected  by  the  differing 
interpretations  and  conceptions  of  various  parameters  and 
conditions  by  the  individual  assessors.  This  is  shown  clearly 
in  Figure  3,  where  the  assessor  for  the  gold  standard  graded 
almost  50%  of  the  original  images  as  being  of  poor  quality, 
whereas  the  other  assessors  found  that  the  majority  of  the 
images were of good quality until at least Q=20. This
indicates that the assessors had a different standard in mind
for grading the quality of otology images, and that future
studies  should  include  a  session  where  image  quality  is 
discussed  and  descriptions  are  standardised.  This  could 
include  the  development  of  a  set  of  criteria  for  factors  such  as 
the  plane  of  focus  and  which  features  are  clinically 
significant. 

There is a poor rate of correct identification of which ear
was being viewed (Figure 4), and there is no apparent
relationship to the level of image compression. In a few
situations (assessor 2, Q=40, 20 and 15) the rate was actually
worse than 50%. It is very unlikely that image compression
changes the image so much that the opposite ear appears to be
viewed. A poor quality image may confuse inexperienced
assessors, even about which ear is being viewed. The ear is
orientated at a lateral to medial angle to the ear canal from
posterior to anterior, allowing determination of side.

The plots of agreement (Figure 5) and sensitivity and
specificity (Figures 6 and 7) do not show any relationship
between the ability to detect the presence or absence of an
abnormality and the degree of image compression. The
agreement is about 80% for assessor 1 and 70% for assessor
2. In most cases sensitivity and specificity remain relatively
unchanged. However, sensitivity (the percentage of those
with the condition positively identified by the assessor) is
poor in most cases. On the other hand, specificity (the
percentage of the ears without the condition correctly
identified by the assessor) is very good in most cases; only
for serous otitis media and a normal eardrum was specificity
poor. These data indicate that the assessors often missed the
presence of a condition, and when taken with the knowledge
that the gold-standard image quality was poor for almost 50%
of the images, it can be concluded that image quality masked
the presence of the conditions. Other factors may also have
contributed, such as the memory effect or the experience of
the assessor.

V.  Conclusion 

1. Images can probably be compressed to about 20KB from
1.44MB  using  JPEG  compression,  although  further  study  of 
what  clinically  significant  features  and  abnormalities  are  lost 
with  image  compression  is  required. 

2.  A  good  protocol  for  image  assessment,  with  a  session  for 
assessors  to  standardise  their  definitions,  is  necessary  for 
subjective  assessments  of  the  otology  images. 

3. Only images that can be graded as good or excellent should
be  used  as  an  original  set  of  images  for  future  studies. 

4.  Further  research  is  warranted  into  the  reason  for 
differential  compression  of  the  different  colour  channels. 